ABSTRACT


Several miscellaneous results are given in the theory of digital codes and finite
state machines. Of particular interest in either case is the problem of ergodicity, i.e.
whether a machine (which may be a decoder) can be reset to the correct state with the
proper input sequence and with no knowledge of the present states or outputs. Some
of the results given are algebraic methods and other tools useful in examining ergodicity
and related problems. Consideration is also given to the synthesis of sequential machines
to economically solve tasks which are essentially non-sequential.
Purpose

The present contract has as its aim a study of certain properties of digital codes
used for communication, both in general and as applied to Signal Corps problems. Some
consideration is also given to finite state machines since they are necessary for encoding
and decoding operations, and since they are analyzed by essentially the same mathematical
tools.


Factual  Data 

As  in  the  previous  report,  the  work  to  be  reported  here  will  be  grouped  under  three 
general  headings: 

1.  Synthesis  of  coding  and  other  digital  apparatus 

2.  Further  results  on  the  symbol  string  algebra 

3.  Topics  concerning  digital  codes. 
1.1  Turing Machines as Stored Program Computers  It is desired to study very simple
stored program computers because these represent economical ways to synthesize digital
transducers. A storage element is usually cheaper than a logic element (diode etc.),
especially if the latter must be individually wired. Of course, a stored program computer
has the additional advantage that it can be programmed to perform a different task,
but if the stored program transducer is cheap enough it can be economically used for a
specific application. Instances already exist where people have purchased general purpose
computers to accomplish specific tasks rather than design and build their own apparatus.
A method for analyzing and synthesizing computers was given in the last report.
The basic form was shown in Fig. 9 of that report, and the timing cycles in its Table 1.

The purpose here is to throw as much of the burden of computation as is possible on the
memory unit. A considerable effort has been expended by various workers in trying to
simplify the universal Turing machine, and the resulting designs can be useful as boundary
conditions on practical transducers. These Turing machine designs are attempts to reduce
the product of the number of "tape symbols" and "head states", this product being
suggested by Shannon (2) as a suitable measure of complexity. How is the product of
tape symbols and head states related to questions of machine operations vs. programmed
operations, optimum word length, etc.?

A computer will now be described which resembles a Turing machine except that
a finite tape will be assumed. The timing table and block diagram are shown in Fig. 1.1.
At time 1 the tape is read and the main calculation is made: the new head state z, the
next tape symbol to be printed s, and the tape displacement d are calculated as functions
of the old head state x and the tape symbol at a, t(a). At time 2 the new symbol s is
printed at position a on the tape and the present tape position is shifted to register I.
At time 3 the new head state z is put in the head store X; and the new tape position g is
calculated from the old position a and the displacement d, thus shifting the tape. The
shift in a real Turing machine is either 1 right or 1 left (address increased or decreased
by 1); but here, since the tape is finite, more general shifts could be allowed, even to
any position on the tape (i.e. any address). In a theoretical Turing machine some of the
details given in Fig. 1.1 are of no interest, i.e. the clock, the temporary storage, the
tape position register and tape shifter. Some of these could be eliminated by postulating
suitable delays, e.g. if the Operation Table has a delayed output the tape could be read
and written in one part of the cycle. Most Turing machine theory, being concerned
with computability and not practical machine design, concentrates on the number of head
states n and the number of tape symbols m. As shown in Fig. 1.1, the Head State register
will have $\log_2 n$ bits of storage capacity (and that many pairs in the horizontal bus
leaving it), and the tape output vertical bus will have $\log_2 m$ pairs. The Operation
Table thus has $\log_2 mn$ pairs (Boolean variables and their complements) entering it, not
counting "c" which is merely a timing pulse from the clock. The Operation Table, if
written out explicitly, would have mn entries, each specifying $1 + \log_2 mn$ bits. Using
Shannon's method for constructing 2 state or 2 symbol Turing machines (2) and starting
from Minsky's machine (4) of 6 symbols and 7 states, the following possibilities exist:


n states:    174     7     2


These effectively specify the number of Boolean variables entering the Operation Table
as between 6 and 9 binary digits. Some thoughts on the design of such multiple output
combinational circuits were given in Section 1.2 of the previous report. The parameter
$\log_2 m$ in a Turing machine corresponds to word length in a computer. The parameter
n in a Turing machine does not correspond to a single parameter in a computer, but it
does represent the number of operations and the accumulator size together.
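The three-phase cycle of the finite-tape machine described above can be sketched in a few lines of Python. This is only an illustration, not the circuit of Fig. 1.1: the dictionary `op_table`, the function names, the wraparound of the tape address, and the small example table are all assumptions made for the sketch. The helper `table_input_bits` evaluates the number of Boolean variables entering the Operation Table, $\log_2 mn$ rounded up.

```python
import math

def step(op_table, tape, x, a):
    """One machine cycle; returns the new head state and the new tape address."""
    # Time 1: read t(a) and look up new state z, symbol s, displacement d.
    z, s, d = op_table[(x, tape[a])]
    # Time 2: print s at position a (the old address is still held).
    tape[a] = s
    # Time 3: store z and compute the new address from a and d.
    # Wrapping around the finite tape is an assumption of this sketch.
    return z, (a + d) % len(tape)

def run(op_table, tape, x, a, halt, max_steps=1000):
    """Repeat the cycle until the (hypothetical) halting state is reached."""
    for _ in range(max_steps):
        if x == halt:
            break
        x, a = step(op_table, tape, x, a)
    return x, a

def table_input_bits(m, n):
    """Boolean variables entering the Operation Table: ceil(log2(m * n))."""
    return math.ceil(math.log2(m * n))

# Hypothetical 2-state, 2-symbol table: skip over 1s, turn the first 0 into 1, halt.
EXAMPLE_TABLE = {
    ("scan", 1): ("scan", 1, +1),
    ("scan", 0): ("halt", 1, 0),
}

tape = [1, 1, 0, 0]
final_state, final_pos = run(EXAMPLE_TABLE, tape, "scan", 0, "halt")
```

Because the displacement d is an ordinary integer here, the same sketch also covers the more general shifts that a finite tape allows, not only one step left or right.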

Some interesting comparisons can now be drawn between the Turing machine of
Fig. 1.1 and the computer of Fig. 9 of the previous report. Words in a single address
computer can be broadly divided into three types: operation-data address,
operation-instruction address (jumps), and data words (numbers etc.). Instructions are
sometimes broken down into more than two fields (e.g. operation-index register-address)
and the operation sometimes spills over into the address field, e.g. in shift instructions.
The address may be absolute (directly referring to any part of the memory) or relative
(specifying the memory location by its distance from the present instruction). The
computer may have many more operations than the operation field indicates, since some
can be performed by subroutines. In the extreme case of a Turing machine, the word
(tape symbol) cannot be broken down into separate fields at all, and the address is so
"relative" that only the previous or next locations can be addressed. It is desired to
study devices halfway between a Turing machine and a single address computer.

The relationship of a Turing machine to various types of digital computer is shown
in the hierarchy of Table 1.1. The 1, 2, 3, and 4 address machines select their instructions
from successive storage locations (except for possible jumps to any part of storage) and
their data from arbitrary storage locations given by the address part of the instruction
word. The 1+1 address machine jumps to an arbitrary location for the next instruction
after every step, but this seems to be of no great advantage except where a cyclic storage
medium is used. Now it is quite possible to imagine a single address machine to
be modified so that the data is part of the instruction word rather than merely the
address of the data (4). If the same datum enters several instructions this might prove
inconvenient, but it is conceivable that the data would be shorter than the address
(obviously true for an infinite tape storage). This suggests the "0-address" computer.
A "partial address" is possible as an intermediate step between full address and no
address; e.g. the basic CDC 160 (5) has 8192 storage locations and a 6 bit address,
relative addresses being used for most operations. Carried to an extreme, the relative
address becomes the left or right shift of the Turing machine tape. Specifications for
computers mentioned were obtained from references 4 to 8.

TABLE 1.1

                                Data Addresses
Instruction
Addresses          0           1             2               3

    0           Turing      IBM 709     Univac 1103     Honeywell 800
    1                       IBM 650

TABLE 1.2

An interesting possibility is suggested by Table 1.2 in the box labeled "Step Data".
This would be the opposite of a 1-address machine. A 1-address machine steps through
the instructions while pulling data from various locations specified by the address. A
"Step Data" computer would step through the data while pulling instructions from various
locations specified by the addresses.

1.2  Basic computer relationships  It is desired here to study the relationships between
the parameters of a general purpose stored program computer in the same way that
Turing machine theory has done for that type. Some of the parameters that should be
related are memory size, word size, combinational circuit size, operation time etc.
Suppose a computer has a storage unit S consisting of M binary storage elements. Let
P, X, and Y be three subsets of S, where P and X are disjoint. Let $p_t$, $x_t$ and $y_t$ be
vectors with binary components representing the contents of the respective memory
subsets at time t (t = 0, 1, 2, ...). The problem which the computer is to solve is to make
$y_\tau$ a function f of $x_0$. Note that $x_0$ and $y_\tau$ may have many components and $y_\tau$ may
represent the solution to a differential equation, i.e. "function" is used generally.

Definition: The program in P calculates f in time $\tau$ and places the answer in Y
provided that

(1)  $y_\tau = f(x_0)$ for all possible $x_0$.

(2)  The result (1) is achieved regardless of the initial states of
the elements of S not in P ∪ X.

(3)  If the initial value of any single element of P is changed, result (1) is not
achieved. (This insures that every part of P is necessary, at least if the program
is to work for all x.)

(4)  For no t less than $\tau$ are the above true.

If there are n components in x then there are $2^{2^n}$ distinct single variable Boolean
functions which can be formed from the components of x. This means that there must be
at least $2^{2^n}$ distinct settings of subsets of the M memory elements. Thus

$$2^{2^n} < 1 + \binom{M}{1} 2 + \binom{M}{2} 2^2 + \binom{M}{3} 2^3 + \cdots + 2^M = (1 + 2)^M$$

$$2^n < M \log_2 3$$

If y has r components, there are $2^{r 2^n}$ distinct functions, giving

$$r\,2^n < M \log_2 3 \qquad (1.1)$$

The largest memory in use seems to be that of the IBM Stretch computer, about
16,800,000 bits. If r = n, this limits n to about 20 bits, certainly much smaller than
the input data to most problems. The conclusion is that no computer is "general
purpose" in the strict sense.
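The arithmetic behind the figure of about 20 bits can be checked directly from Eq. 1.1. The sketch below searches for the largest n satisfying $r\,2^n < M \log_2 3$; the function name `largest_n` and its `r` parameter are illustrative choices, not notation from the report.

```python
import math

def largest_n(M, r=None):
    """Largest n with r * 2**n < M * log2(3); r=None means r = n as in the text."""
    budget = M * math.log2(3)
    n = 0
    while True:
        candidate = n + 1
        outputs = candidate if r is None else r
        if outputs * 2 ** candidate >= budget:
            return n
        n = candidate

STRETCH_BITS = 16_800_000  # memory size quoted in the text

n_when_r_equals_n = largest_n(STRETCH_BITS)       # the case r = n
n_single_output = largest_n(STRETCH_BITS, r=1)    # one output bit only
```

Even with a single output bit the bound only rises to n = 24, which illustrates how weakly the admissible input size depends on the memory size.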

Nothing has been said so far about the form of the program p. As is evident from
Section 1.1 above, there are a great number of ways in which instruction words can be
formed, and a general theory should not put too many restrictions on the way the storage
is broken down into words. Note that if part of the program contains the truth table
of f, that part alone requires $r\,2^n$ storage elements. Thus, Eq. 1.1 is a little optimistic
in assuming all subset settings are useful programs, but the difference in calculating n
is very small. As was suggested in the previous report, the program is essentially
as complicated as its simplest description (the truth table). This follows when it is
realized that the table lookup program could hardly be comparable in size to the complete
table of $r\,2^n$ entries. It appears that no more useful bound than Eq. 1.1 will be
found unless the type of problem (Boolean function) is suitably restricted. This same
question was raised by Shannon in connection with relay contact networks (9). His
conclusion seems to be that since it requires an enormous number of contacts to realize a
randomly selected function of a moderate number of input variables, and since networks
have been made with a large number of input variables, the functions of common
interest must be of some special type. The ideas of Shannon on functional separability,
group invariance, and symmetrical functions seem as appropriate to computers as to
switching networks, the action and complexity of switching networks and computer
programs being so similar.

It is not so surprising that even the largest computers cannot calculate an arbitrary
function of 20 binary variables when the size of the simplest description of such a
function is examined. There are $2^{2^{20}}$ such functions, so the description requires about
$10^6$ bits. This is about 200,000 English letters, about 3000 lines of type, or about 66 pages
of print. This would be 66 pages of compact mathematical description; a description in
English would be much longer.
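The size of this description follows from the fact that the truth table of a single Boolean function of n variables has $2^n$ entries of one bit each. The short sketch below redoes the conversion into letters, lines and pages. The conversion factors (5 bits per letter, 66 letters per line, 45 lines per page) are assumptions chosen to be roughly consistent with the figures quoted in the text; they are not stated in the report.

```python
def truth_table_description(n, bits_per_letter=5, letters_per_line=66,
                            lines_per_page=45):
    """Size of the truth table of one Boolean function of n variables.

    The three conversion factors are assumed values, not taken from the report.
    """
    bits = 2 ** n                      # one bit per truth-table row
    letters = bits // bits_per_letter
    lines = letters // letters_per_line
    pages = lines // lines_per_page
    return bits, letters, lines, pages

bits, letters, lines, pages = truth_table_description(20)
```

With these assumed factors the result lands close to the rounded values in the text (roughly a million bits, two hundred thousand letters, three thousand lines).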


A rough indication of how time of computation might enter the theory can be
provided if two additional assumptions are made:

(5)  The value of x is independent of time during one problem.

(6)  The time of execution $\tau$ is the same for all problems f.

Now not only must all initial programs be distinct, but so must all intermediate states
of the initial program region. This leads to

$$r\,2^n + \log_2 \tau < M \log_2 3 \qquad (1.2)$$

For a given r and n the computation time $\tau$ cannot be made too large without requiring
larger M. This seems contrary to the notion that a sequential computer gains economy
by using its parts over and over. However, the bound is obtained for arbitrary functions;
for separable or symmetric functions (9) the result might be quite different and long
execution times might be indicated. Note that a single table-lookup does not result in a
very large value for $\log_2 \tau$.

It is intriguing to consider a computer whose only basic operations are "Sheffer
stroke" and "Jump". If y is to be a general function of x, it is only necessary to
consider r separate scalar functions $y_i$ of x. Each of these can be expressed in normal form
as a logical sum of logical products (of the variables and their negatives). Each product
of n terms can be obtained by successively multiplying just 2 terms, and similarly for
sums. But each of the 3 Boolean operations (sum, product, negate) can be expressed
in terms of the single binary operation Sheffer stroke X | Y (either not X or not Y).
The stroke operation would suffice not only in handling the data x but also in bookkeeping
operations as the program itself is modified through $p_0, p_1, \ldots, p_\tau$. For ordinary computer
programs the expression of all operations (say addition of two 10 digit numbers) in terms
of the stroke function would result in a great increase in complexity. However, if
subroutines and symbolic programming are used, the program writing task might not be too
bad. The control and "arithmetic unit" parts of the computer would be very simple, as
in a Turing machine. The problem remains, however, that if nothing is done to limit
the type of function which the computer is to handle, any interchange between control and
"arithmetic unit" complexity and storage capacity will be masked by the size of the
truth table.
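The claim that sum, product and negation can all be built from the stroke alone is easy to verify exhaustively. The following sketch uses the standard constructions; the function names are illustrative, and the n-term product is formed by repeatedly combining two terms, as described above.

```python
from functools import reduce
from itertools import product

def stroke(x, y):
    """Sheffer stroke X | Y: either not X or not Y."""
    return not (x and y)

def negate(x):
    return stroke(x, x)

def conj(x, y):
    # Logical product: the negation of the stroke.
    return stroke(stroke(x, y), stroke(x, y))

def disj(x, y):
    # Logical sum: the stroke of the two negations.
    return stroke(stroke(x, x), stroke(y, y))

def conj_n(terms):
    """Product of n terms obtained by successively multiplying just 2 terms."""
    return reduce(conj, terms)

BOOLS = (False, True)

# Compare every construction with the built-in Boolean operators.
ALL_AGREE = (
    all(negate(x) == (not x) for x in BOOLS)
    and all(conj(x, y) == (x and y) for x, y in product(BOOLS, repeat=2))
    and all(disj(x, y) == (x or y) for x, y in product(BOOLS, repeat=2))
    and all(conj_n(t) == all(t) for t in product(BOOLS, repeat=4))
)
```

Each two-input product costs three strokes in this construction, which gives a feeling for the increase in complexity mentioned in the text.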
1.3  Tabulation of four state binary machines  In a previous report a tabulation of
three state binary machines was made, and this has proved very useful in giving examples
of machines with certain combinations of properties (e.g. non-resettable, non-periodic
and simple). An n state binary machine will be defined here as a collection of n
states with a 0 arrow and a 1 arrow leaving each state and ending in a state, possibly the
same state. The machine is to be strongly connected, i.e. every state must be reachable
from every other state.

Consider the graph shown in Fig. 1.2a. This is topologically the same as that
shown in Fig. 1.2b. Also, for the purposes of this enumeration, Fig. 1.2c is considered
the equivalent of the other two. It is desired to tabulate only topologically distinct graphs
so that different orientations, reflections, notations etc. do not yield new cases; and if
all the labels on the arrows are simultaneously changed, the resultant graph is not
considered distinct. The big problems in making such a tabulation are to include all possible
graphs and not to include two equivalent graphs. The method adopted is to start with a
simplified graph, and then to add details in such a way that each graph can be formed in
one and only one way. As shown in Fig. 1.3, the graph of Fig. 1.2a is the fifth version,
the steps being adding loops, adding multiple connections, adding directions to the
connections and, finally, labeling the arrows. The numbers of four state diagrams found at each
step (counting only those leading to possible final graphs meeting the required conditions)
are 5, 24, 108, as shown in Fig. 1.3, and there are 460 graphs in the final tabulation.
There are actually six graphs of the type shown in Fig. 1.3a, but one (F) does not lead
to any (binary) graphs meeting the conditions. The number of distinct graphs of each
type is shown in Fig. 1.4. As an example of how the classification is carried out,
Fig. 1.5 shows the variants of D. Consider D3: there are already 8 transitions so that
no double connections can be added. The arrows can only be drawn in the two ways shown
in Fig. 1.6a and b. Now the first arrow can be labeled either 0 or 1, so label the diagonal
arrow in all cases 0. This immediately requires one other arrow to be labeled 1 in each
graph. Fig. 1.6a has no symmetries, so that the remaining 3 arrows (3 being always
oppositely labeled) can be labeled in 8 different ways. On the other hand, it does not
matter in Fig. 1.6b whether the lower left state is labeled [diagram] or [diagram]
because of symmetry, therefore only 4 graphs result from Fig. 1.6b. The 108 labelless
directed graphs such as in Fig. 1.6a, b can all be labeled in 8, 6, 4, 3, 2, 1 ways
according to the types and numbers of symmetries they possess.

Four states can be represented by the settings of a pair of flip-flops, so that there
are 460 ways to feed a single binary input to a pair of flip-flops, each resulting in a
different behavior pattern, but each allowing every state to be reached from every other.
If this pair of flip-flops is to be connected into a larger network, interchanging the 0's and
1's at the input and relabeling the states can result in different behavior of the larger
network, and if these changes are considered to give distinct machines the total number
possible will be somewhere between 460 and 2 × 4! × 460 types. Although symmetries
will keep the number considerably below the upper bound, the number of ways of
connecting just two flip-flops is remarkably large.

The tabulation of the 460 graphs was done some time ago but was never published
because it was not sufficiently checked. Recently this number has been checked by a
thesis student, Mr. Donald Chin. A tabulation of the 460 types will be prepared as a
separate report.
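The equivalence used in this tabulation (renaming the states, interchanging the 0 and 1 labels on all arrows at once, or both) lends itself to a brute-force check for small n. The sketch below is not the step-by-step construction of Fig. 1.3; it simply enumerates every transition table, keeps the strongly connected ones, and counts equivalence classes through a canonical form. All function names are illustrative.

```python
from itertools import permutations, product

def _reach(start, successors):
    """Set of states reachable from start under the given successor function."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def strongly_connected(d0, d1):
    """True if every state can be reached from every other state."""
    n = len(d0)
    pred = [[] for _ in range(n)]
    for s in range(n):
        pred[d0[s]].append(s)
        pred[d1[s]].append(s)
    forward = _reach(0, lambda s: (d0[s], d1[s]))
    backward = _reach(0, lambda s: pred[s])
    return len(forward) == n and len(backward) == n

def canonical(d0, d1):
    """Smallest relabeling over all state permutations and the 0/1 label swap."""
    n = len(d0)
    best = None
    for p in permutations(range(n)):
        for a, b in ((d0, d1), (d1, d0)):
            na, nb = [0] * n, [0] * n
            for s in range(n):
                na[p[s]] = p[a[s]]
                nb[p[s]] = p[b[s]]
            key = (tuple(na), tuple(nb))
            if best is None or key < best:
                best = key
    return best

def count_machines(n):
    """Number of inequivalent strongly connected n state binary machines."""
    classes = set()
    for table in product(range(n), repeat=2 * n):
        d0, d1 = table[:n], table[n:]
        if strongly_connected(d0, d1):
            classes.add(canonical(d0, d1))
    return len(classes)

SMALL_COUNTS = [count_machines(n) for n in (1, 2)]
```

Calling `count_machines(4)` examines all 65,536 transition tables and gives a figure that can be compared with the 460 graphs reported here; no output for that case is asserted in this sketch.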

2.1  Semigroups without the descending chain conditions  In dealing with the semigroup
corresponding to symbol concatenation in languages, it is usually assumed that the
semigroup is free (no relations between the generators), or that it satisfies the descending
chain condition (any sequence of strings obtained by cancelling symbols from beginning
or end must terminate). These conditions are equivalent and they imply a unique
factorization of any string into symbols. In dealing with compound codes the semigroup symbols
are matrices, the descending chain condition does not hold, and the factorization is not
unique. In the Third Semi-annual report (13), the descending chain condition was replaced
by Postulate 14 (13), that every element have some finite factorization. It is desired here
to study the structure of semigroups having neither Postulate 14 (13) nor the descending
chain condition, but which nevertheless allow certain canonical factorizations.

Consider a set of elements S with a closed binary operation (indicated by
juxtaposition) obeying the postulates P1 to P6 below.

P1: For every x, y, and z in S, $x(yz) = (xy)z$. A zero will be defined as an element
$\theta$ which obeys $\theta x = x \theta = \theta$ for all x in S.

P2: For every x, y, and z in S, $xy = xz$ implies $y = z$, provided that $xy \neq \theta$.
Similarly $yx = zx$ implies $y = z$, provided that $yx \neq \theta$.

P3: For every w, x, y, and z in S, $wx = yz \neq \theta$ implies that either a u exists in S
such that $wu = y$, or that a v exists in S such that $w = yv$. The dual statement can
be taken as a postulate or proved from the other postulates.

P4: Each element has a right and left unit, possibly depending on the element.
That is, for each x in S there exists some u in S such that $xu = x$ and some v in S such
that $vx = x$. An element e, not equal to $\theta$, will be defined as idempotent if $ee = e$. The
following lemmas will be useful in proving the main theorems:

L1: The zero, if it exists, is unique. Suppose $\theta_1$ is a zero. Then $\theta_1 x = \theta_1$ for all
x, including any other zero $\theta_2$. But if $\theta_2$ is a zero, $\theta_1 \theta_2 = \theta_2$. Therefore $\theta_1 = \theta_2$.

L2: Similarly, there is no element w ($\neq \theta$) such that $wx = \theta$ for all x in S. By P4,
there exists a u such that $wu = w$, but multiplication is unique. Note that there may be
divisors of zero, i.e. pairs of elements x, y such that $xy = \theta$.

L3: For any particular element x in S, the right unit is unique and the left unit is
unique. Proved by P2.

L4: Every idempotent is the unit of some non-zero element; and every unit of a
non-zero element is idempotent. For every idempotent e is the unit of itself ($ee = e$);
and if u is a unit of x ($\neq \theta$) it follows that $x = xu = xuu$, therefore $u = uu$ by P2.

L5: The product of two distinct idempotents is zero. For $e_1 e_2 = e_1 e_2 e_2$ implies
either $e_1 e_2 = \theta$, or by P2, $e_1 = e_1 e_2$. But $e_1 = e_1 e_1$, therefore $e_1 = e_2$ by L3.

L6: The set of elements eS, where e is an idempotent of semigroup S, is a non-empty
subsemigroup of S. It is non-empty since S contains the right unit of e and eS
then contains at least e. Let $e s_1$ and $e s_2$ be two elements of eS, then
$(e s_1)(e s_2) = e(s_1 e s_2) \in eS$.

L7: Let $e_1$ and $e_2$ be two distinct idempotents of S, then $e_1 S$ and $e_2 S$ have no
common elements except $\theta$. Suppose $e_1 x = e_2 y \neq \theta$ where x and y are elements of S. By
P3 either there exists a u in S such that $e_1 u = e_2$, or a v in S such that $e_1 = e_2 v$.
In the former case, multiply both sides by $e_1$: $e_1 u = e_1 e_2 = \theta$ (by L5), then
$e_2 y = \theta$, contradicting the original hypothesis. A similar argument applies
to the second possibility.

L8: Every element of S is in one of the subsemigroups $e_i S$. Consider any element
x of S. Its left unit is an idempotent by L4. If $e_i$ is the left unit of x, then $e_i x = x$ and
x must be in $e_i S$. Lemmas 6 to 8 apply dually to the subsemigroups Se.

L9: The class $e_i S e_j$ is a subsemigroup and it is disjoint with $e_k S e_h$ unless $i = k$
and $j = h$. Every element of S is in one of $e_i S e_j$. Also $e_i S e_j$ ($i \neq j$) is a null semigroup,
i.e. all products are $\theta$. Proof similar to above.

The lemmas presented above show that S can be expanded into disjoint semigroups
in several ways:

$$S = \sum_i e_i S \quad \text{(right ideals)} \qquad (2.1)$$

$$S = \sum_j S e_j \quad \text{(left ideals)} \qquad (2.2)$$

$$S = \sum_{i,j} e_i S e_j \qquad (2.3)$$

These are analogous to expanding a group into cosets. An expansion into two sided ideals
can be obtained by applying Eq. 2.1 to $S = SS$:

$$S = \sum_i S e_i S \qquad (2.4)$$

Now, in order to prevent the semigroup from consisting of many subsemigroups entirely
disconnected with each other, assume:

P5: For every $e_i$ and $e_j$ there exist elements $\phi_{ij}$ and $\phi_{ji}$ of S such that
$e_i = \phi_{ij}\, e_j\, \phi_{ji}$.

Theorem 1:  The following expansion of the semigroup S holds:

$$S = \sum_{i,j} \phi_{ik}\, S_{kk}\, \phi_{kj} \quad \text{(for any k)} \qquad (2.5)$$

where $S_{kk} = e_k S e_k$ and $\phi_{ii} = e_i$.

This theorem is easily proved using P5 and Eq. 2.3. It follows that every element s
in S has an expansion

$$s = \phi_{ik}\, s_{kk}\, \phi_{kj} \qquad (2.6)$$

where there are several choices for $s_{kk}$ but where i and j are uniquely determined. The
subsemigroups $\phi_{ik} S_{kk} \phi_{kj}$ are isomorphic for different k, with i and j fixed, and are
disjoint. Finally:

P6: The subsemigroup $S_{11} = e_1 S e_1$ is freely generated by $g_1, g_2, \ldots, g_n$. It follows
now that any element of S can be expanded uniquely according to Eq. 2.6 as

$$s = \phi_{i1}\, k_1 k_2 \cdots k_l\, \phi_{1j} \qquad (2.7)$$

where each $k_m$ is one of $g_1, \ldots, g_n$ and l is the "length" (possibly 0) of s.

Theorem 2:  In S there are only 5 types of elements:

1)  the zero $\theta$

2)  the idempotents $e_i$

3)  the $\phi_{ij}$ ($i \neq j$)

4)  the generators $g_1, g_2, \ldots, g_n$

5)  composite elements as in Eq. 2.7.

The abstract formulation given above can be clarified by noting that $\theta$ is a matrix
of empty sets; the elements $\phi_{ij}$ consist of a matrix with a null string in the i'th row and
j'th column and the other elements empty sets; $e_i = \phi_{ii}$, that is a null string on the diagonal;
the generators $g_i$ consist of a single letter in the first row, first column and empty sets
elsewhere; and composite elements have a single string in one position of the matrix and
empty sets elsewhere. The whole semigroup S is the set of union-irreducible elements (12).
The abstract formulation is given in an attempt to obtain a mathematical system which is
neither too specific nor too general, and therefore to obtain a one-to-one correspondence
between coding theorems and algebraic theorems. By this means it is hoped to extend
theorems about simple codes to compound codes, as was done for the Sardinas-Patterson
algorithm.
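The matrix interpretation just described can be made concrete with a very small model. In the sketch below a non-zero element is stored as a triple (row, string, column), standing for a matrix with that single string in one position and empty sets elsewhere, and the zero is represented by `None`. The representation and all function names are assumptions made for illustration; only the multiplication rule follows from the matrix picture.

```python
ZERO = None  # stands for the matrix of empty sets

def mul(a, b):
    """Product of two single-entry matrices of strings."""
    if a is ZERO or b is ZERO:
        return ZERO
    (i, s, j), (k, t, l) = a, b
    # The entries meet only if the column of a equals the row of b.
    return (i, s + t, l) if j == k else ZERO

def phi(i, j):
    """Null string in row i, column j."""
    return (i, "", j)

def e(i):
    """Idempotent: null string on the diagonal."""
    return phi(i, i)

def g(letter):
    """Generator: a single letter in the first row, first column."""
    return (1, letter, 1)

def expand(x):
    """Rebuild x as phi_i1 * (generators) * phi_1j, in the manner of Eq. 2.7."""
    i, s, j = x
    out = phi(i, 1)
    for letter in s:
        out = mul(out, g(letter))
    return mul(out, phi(1, j))

x = (2, "ab", 3)
CHECKS = {
    "distinct idempotents multiply to zero (L5)": mul(e(1), e(2)) is ZERO,
    "idempotent": mul(e(2), e(2)) == e(2),
    "left and right units (P4)": mul(e(2), x) == x and mul(x, e(3)) == x,
    "P5 connection": mul(mul(phi(1, 2), e(2)), phi(2, 1)) == e(1),
    "expansion reproduces x": expand(x) == x,
}
```

In this model the left unit of an element is fixed by its row and the right unit by its column, which is why the indices i and j in the expansion are uniquely determined.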

2.2  Further results in the code string algebra  The code string algebra which was
developed in previous reports deals with matrices of sets of symbol strings, allowing
the null string. Let $*$ be any binary closed operation which distributes over addition
$+$ (set union, position by position in the matrix). If $G \subseteq H$ is defined to mean that there
exists a (possibly empty) matrix X such that $G + X = H$, then

$$G \subseteq H \ \text{implies} \ P * G * Q \subseteq P * H * Q \qquad (2.8)$$

because $P * G * Q + Y = P * H * Q$ where $Y = P * X * Q$. Here $*$ might stand for $\cdot$
(concatenation or multiplication), or $\backslash$, or $/$ (divisions). Similar reasoning shows that

$$G \subseteq H \ \text{and} \ I \subseteq J \ \text{implies} \ G * I \subseteq H * J \qquad (2.9)$$

The relation between $*$ and intersection can now be established: First note that since
$G \cap H \subseteq G$, by Eq. 2.9 it follows that $P * (G \cap H) * Q \subseteq P * G * Q$. Next note that
$A \subseteq B$ and $A \subseteq C$ implies $A \subseteq B \cap C$. Set $A = P * (G \cap H) * Q$, $B = P * G * Q$, and
$C = P * H * Q$, and it follows that

$$P * (G \cap H) * Q \subseteq (P * G * Q) \cap (P * H * Q) \qquad (2.10)$$

again where $*$ represents either concatenation or division. Of course, one sided
relations can be obtained from all of the above equations by setting either P or Q equal to a
diagonal matrix of null strings.
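The inclusions above can be tried out on small examples. The sketch below covers only the concatenation case of $*$, with matrices held as nested lists of Python sets of strings; the helper names `cat`, `meet`, `join` and `leq` are illustrative. The example also shows that the inclusion of Eq. 2.10 can be strict.

```python
def cat(A, B):
    """Matrix product with string concatenation as multiplication, union as sum."""
    n = len(A)
    return [[{x + y for k in range(n) for x in A[i][k] for y in B[k][j]}
             for j in range(n)] for i in range(n)]

def meet(A, B):
    """Position-by-position intersection."""
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def join(A, B):
    """Position-by-position union (the addition of the algebra)."""
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def leq(A, B):
    """A is contained in B, position by position."""
    return all(a <= b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# One-by-one matrices are enough to see a strict inclusion.
P = [[{"a", "ab"}]]
G = [[{"bc"}]]
H = [[{"c"}]]
Q = [[{""}]]  # the null string acts as a unit

lhs = cat(cat(P, meet(G, H)), Q)                   # P * (G meet H) * Q
rhs = meet(cat(cat(P, G), Q), cat(cat(P, H), Q))   # (P*G*Q) meet (P*H*Q)

monotone = leq(cat(cat(P, G), Q), cat(cat(P, join(G, H)), Q))  # Eq. 2.8
```

Here G and H share no string, so the left side is empty, while "abc" arises on the right both as "a" followed by "bc" and as "ab" followed by "c".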

In ordinary matrix theory (elements of the matrix in a commutative ring), the law $(AB)' = B'A'$ holds, where the prime indicates the transpose. This is not true in the code string algebra, since the semigroup of element multiplication is not commutative. If $T$ ($\in \mathcal{J}$) is a matrix with one null string in every row and one null string in every column, and if $I$ is a matrix with a null string in every position on the main diagonal, then

$T T' = T' T = I$  (2.11)

$T_1 T_2 \in \mathcal{J}$ if $T_1 \in \mathcal{J}$ and $T_2 \in \mathcal{J}$  (2.12)

These follow from the fact that the $T$ matrices represent permutations. Now it follows that

$A = T X T'$ and $B = T Y T'$ implies $AB = T X Y T'$  (2.13)

$A = T X T'$ implies $A^n = T X^n T'$  (2.14)

A matrix $A$ will be defined as automorphic if $A = T A T'$ with $T \neq I$. Any power of an automorphic matrix is automorphic by Eq. 2.14.
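Equations 2.11 and 2.14 can be checked mechanically on a small case. The sketch below represents a matrix of string sets as a list of lists of Python sets. The function names and the three-state matrix `X` are assumptions made for illustration and are not taken from the tabulated machines.

```python
def mat_mul(A, B):
    """Product of two square matrices of string sets.

    Entry (i, j) is the union over k of all concatenations a + b
    with a in A[i][k] and b in B[k][j].
    """
    n = len(A)
    return [[{a + b for k in range(n) for a in A[i][k] for b in B[k][j]}
             for j in range(n)] for i in range(n)]

def permutation_matrix(perm):
    """T-type matrix: a null string at (i, perm[i]) and the empty set elsewhere."""
    n = len(perm)
    return [[{""} if perm[i] == j else set() for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Hypothetical 3-state matrix: one "0" and one "1" in every row.
X = [[{"0"}, {"1"}, set()],
     [set(), {"0"}, {"1"}],
     [{"1"}, set(), {"0"}]]

T = permutation_matrix([1, 2, 0])
Tt = transpose(T)
identity = permutation_matrix([0, 1, 2])

A = mat_mul(mat_mul(T, X), Tt)                    # A = T X T'
lhs = mat_mul(A, A)                               # A^2
rhs = mat_mul(mat_mul(T, mat_mul(X, X)), Tt)      # T X^2 T'
```

For this particular `X` the product `T X T'` reproduces `X` itself, so `X` is automorphic in the sense just defined, and the same holds for its square.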

2.3 Resettability of finite state machines  In another report (12) there have been tabulated the state diagrams with 1, 2 and 3 states such that exactly two arrows (labeled 0 and 1, the machine inputs) leave each state and such that every state can be reached from every other state. Only topologically distinct graphs are shown, i.e. graphs obtained by interchanging the labels on all arrows simultaneously, or by permuting the states, or both, are not shown. Under these restrictions it was found that there is 1 graph of 1 state, that there are 4 graphs of 2 states, and 29 graphs of 3 states*. The graphs have been classified according to 3 properties (simple or compound, resettable or not, bounded delay or not), thus producing 8 categories. It is necessary to go to 4 state graphs in order to give examples in 2 of these categories; this is done on the next page of the report (12). A graph is resettable if and only if a reset signal exists; a reset signal is a sequence of 0's and 1's which will put the machine in a single definite state no matter what the starting state is. The concept of resettability is quite similar to the concept of ergodicity in physics; if a resettable machine is fed with a random input sequence of 0's and 1's, the effect of the starting state on the probability of the present state gradually "wears off" as the length of the input sequence increases. It is desired to obtain necessary and sufficient conditions for a machine to be resettable. As a first step, the reasons why 12 of the 36 machines in the old report (12) are not resettable will be examined. The string set matrix (14) of the machines under consideration will have a single 0 and a single 1 in every row, and will have no null strings. Here the strings represent inputs; there is a single 0 in each row because the machine must know how to change state and can only go to one state, and there are no null strings since they would represent spontaneous state changes. Consider various powers of the matrices of 2 states:

* Some errors have been found in this tabulation: graphs 14, 15 and 32 have 2 components and not 3; graph 5 has the lower left arrow in the wrong direction.
The codes and Mgj are resettable, but and M^g are not. The $n$'th power of the code matrix shows the final state (column) as a function of the initial state (row) and the input signal of $n$ symbols. The code is resettable if and only if one signal appears in every row of some column (possibly accompanied by other signals) in some power of the matrix. If such a signal does appear in every row of some column of the $n_0$'th power, there will then be a signal appearing in every row of some column of the $n$'th power where $n \geq n_0$. In fact, once one column has such a reset signal in it, all columns can be made to contain resets by increasing $n$. The smallest $n_0$ for which resets appear gives the length of the shortest reset signal; $n_0 = 1$ in the above examples. A few more examples will be given:
where inclusion and matrix multiplication are defined as in previous reports. Some conditions on $M$ which prevent $M$ from being resettable can now be established. According to Section 2.2 above, if $M$ is automorphic, i.e.

$M = T M T'$

then $M^n$ will be automorphic and

$C \subseteq T M^n T'$

Now by Eq. 2.8 above,

$T C T' \subseteq T M^n T'$

But $T C T'$ has the same string as $C$ but in a different column, and therefore both cannot be in the same matrix if the matrix represents a determinate machine. An automorphic matrix corresponds to a machine with a proper automorphism, that is, the machine looks the same after a permutation of the states. If the state diagram looks the same after permuting the states, any reset signal would have the same effect on the graph as on the permuted graph. The existence of automorphisms is then the same as the existence of certain types of symmetry in the state diagram.
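The matrix-power criterion used in this section (a signal appearing in every row of some column of some power of $M$) can be sketched as a short program. The machines below are hypothetical examples, not the tabulated ones, and the search is cut off at the bound $2^n - n - 1$ quoted in Section 3.1. The function names are assumptions made for illustration.

```python
def mat_mul(A, B):
    """Product of two square matrices of string sets (union of concatenations)."""
    n = len(A)
    return [[{a + b for k in range(n) for a in A[i][k] for b in B[k][j]}
             for j in range(n)] for i in range(n)]

def shortest_reset_by_powers(M):
    """Return (length, column, signals) for the first power of M in which some
    column has a signal common to every row, or None if no such power is found
    up to the bound 2**n - n - 1."""
    n = len(M)
    bound = max(1, 2 ** n - n - 1)
    power = M
    for length in range(1, bound + 1):
        for col in range(n):
            common = set.intersection(*(power[row][col] for row in range(n)))
            if common:
                return length, col, common
        power = mat_mul(power, M)
    return None

# Hypothetical two-state machine: 0 always leads to state A, 1 always to state B.
two_state_reset = [[{"0"}, {"1"}],
                   [{"0"}, {"1"}]]

# Hypothetical two-state machine: 0 keeps the state, 1 interchanges the states.
# Interchanging the two states leaves it unchanged, so its matrix is automorphic.
toggle = [[{"0"}, {"1"}],
          [{"1"}, {"0"}]]

# Hypothetical three-state machine: 0 steps A->B->C->A; 1 sends C to A and
# leaves A and B where they are.
three_state = [[{"1"}, {"0"}, set()],
               [set(), {"1"}, {"0"}],
               [{"0", "1"}, set(), set()]]

print(shortest_reset_by_powers(two_state_reset))
print(shortest_reset_by_powers(toggle))
print(shortest_reset_by_powers(three_state))
```

The three-state example needs the full fourth power before a reset appears, which shows why the number of strings handled by this method grows quickly with the number of states.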

Certain other cases of non-resettability can be recognized. A state diagram is defined as periodic if the greatest common divisor of the path lengths which start at a fixed state and return to that state is greater than unity. If this g.c.d. is equal to $p$, then the graph has a period $p$ and the states can be divided into $p$ classes $K_1, K_2, \ldots, K_p$ such that either a 0 or a 1 causes a transition from $K_j$ to $K_{j+1}$ (or from $K_p$ to $K_1$). This is shown in the analysis of Markov processes, e.g. Feller, page 330. A periodic machine arises if a uniform code is decoded, and this is one of the classes of non-ergodic codes given by Schützenberger.
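The period defined above can be computed without enumerating closed paths. The sketch below uses a standard graph technique rather than anything stated in the report: the states are levelled by a breadth-first search from one state, and the period is the g.c.d. of `level[u] + 1 - level[v]` over all arrows from `u` to `v`. It assumes every state can be reached from every other state; the dictionary layout and the example machines are assumptions made for illustration.

```python
from collections import deque
from math import gcd

def period(delta):
    """Period of a connected state diagram.

    delta maps each state to the pair (next state on 0, next state on 1).
    """
    start = next(iter(delta))
    level = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in delta[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    g = 0
    for u in delta:
        for v in delta[u]:
            g = gcd(g, level[u] + 1 - level[v])
    return g

# Hypothetical closed circle of three states: both arrows go to the next state.
circle = {"A": ("B", "B"), "B": ("C", "C"), "C": ("A", "A")}

# Hypothetical machine with an arrow from a state to itself (period 1).
with_loop = {"A": ("B", "A"), "B": ("C", "B"), "C": ("A", "A")}

print(period(circle), period(with_loop))
```

A result greater than unity identifies the machine as periodic, and the classes $K_1, \ldots, K_p$ are then the sets of states with equal level modulo $p$.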

The  next  class  of  non-resettable  graphs  arises  from  the  fact  that  it  is  necessary 
to  have  at  least  one  state  with  two  or  more  0  arrows  entering,  or  at  least  one  state  with 
two  or  more  1  arrows  entering.  If  no  such  state  exists,  then  the  last  digit  of  the  reset 
signal  cannot  bring  together  the  final  ambiguous  class  of  states.  But  if  no  state  with  at 
least  two  similarly  labeled  arrows  entering  it  exists,  then  every  state  must  have  just 
two  arrows  entering  it,  a  0  arrow  and  a  1  arrow.  This  is  true  because  the  total  number 
of  arrows  is  equal  to  twice  the  number  of  states,  and  a  single  arrow  entering  one  state 
implies  three  or  more  entering  another.  The  particular  type  of  state  diagram  considered 
here  might  then  represent  a  binary  determinate  machine  if  all  arrows  are  reversed  in 
direction.  Since  it  makes  sense  going  backward,  such  a  graph  will  be  called  a  palindrome. 


Note that it may or may not represent exactly the same machine if the arrows are reversed; numbers 29 and 32 in Fig. 30 (12) do represent the same machine. The idea of reading backward occurs in a palindrome and in the anagrammatic codes of Schützenberger, but the decoder of a simple code with the prefix property cannot be a palindrome since all words end in the base state. It is easy to see why a palindrome cannot satisfy Eq. 2.15 above. In a palindrome a 0 input permutes the states, and so does a 1 input. Therefore, the $M$ matrix is merely the sum of two T-type matrices (single element in each row and column), one for 0 and one for 1. Any power of $M$ will be a sum of T-type matrices:

$M = T_0 + T_1$

$M^2 = T_0 T_0 + T_0 T_1 + T_1 T_0 + T_1 T_1$

Since each input string appears in only one term of this sum, there cannot be a column of identical strings.
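The palindrome property is easy to test directly: each input symbol, taken alone, must permute the states. The sketch below also confirms the consequence used above, namely that no input string can reduce the number of possible states. The function names and both example machines are assumptions made for illustration.

```python
from itertools import product

def is_palindrome(delta):
    """True if a 0 input permutes the states and a 1 input also permutes them.

    delta maps each state to the pair (next state on 0, next state on 1).
    """
    states = set(delta)
    return all({delta[s][k] for s in delta} == states for k in (0, 1))

def image(delta, states, word):
    """Set of states reached from `states` after feeding the input string `word`."""
    current = set(states)
    for symbol in word:
        current = {delta[s][int(symbol)] for s in current}
    return current

# Hypothetical palindrome: 0 keeps the state, 1 interchanges A and B.
toggle = {"A": ("A", "B"), "B": ("B", "A")}

# Hypothetical non-palindrome: two 1 arrows enter state A.
merging = {"A": ("B", "A"), "B": ("C", "B"), "C": ("A", "A")}

# In a palindrome every input string leaves all states possible.
sizes = {len(image(toggle, toggle, "".join(w)))
         for n in range(1, 6) for w in product("01", repeat=n)}
print(is_palindrome(toggle), is_palindrome(merging), sizes)
```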

The third class of non-resettable machines will be those in which an endomorphic image is non-resettable. Here an endomorphism is a mapping from the set of states to a proper subset of states such that the endomorphic image has a single 0 arrow and a single 1 arrow leaving each state. In Fig. 30 (12), number 27 has the following endomorphism: A→A, B→B, C→B. This is shown in Fig. 2.1a. Some periodic machines are a special case of this class, since they have an endomorphic image which is a closed circle of $p$ states (a palindrome). A resettable machine with a resettable endomorphic image is shown in Fig. 2.1b. Examples of machines with no endomorphic images except the single state machine are Fig. 30 (12) number 6 (resettable) and Fig. 30 (12) number 23 (non-resettable, a periodic circle). There are non-resettable machines in which all endomorphic images are resettable, e.g. Fig. 31a (12).

The three causes of non-resettability given above account for all cases involving two and three state machines, but examples of four and five state machines can be given which are not periodic, not palindromes, and which have no non-resettable endomorphic images. The first example is related to the uniformly composed codes given by Schützenberger as one of the classes of non-ergodic codes. A uniformly composed code can be derived by replacing the symbols in a non-binary code by the binary words of unequal length.
[Fig. 2.1 Endomorphisms: (a) non-resettable machine (M27) and its endomorphic image; (b) resettable machine]

[Fig. 2.2 Uniformly Composed Code (labels: resettable, periodic, non-resettable, tandem machines; A = X1, B = Y1, C = X4, D = Y4, etc.)]
aa, ab, ac, ba, bb, bc, ca, cb, cc

with the replacement a = 0, b = 10, c = 11. Even if the set of binary words is such that the ternary letters can be identified, there will never be any way of knowing whether the ternary letter is the first or last in one of the nine words above. The example given in the coding report in Fig. 31a (12) can be derived from Schützenberger's example as shown in Fig. 2.2. The four state example can be derived more directly and is shown in Fig. 2.3. It might be thought that non-resettability might arise in more complex ways by hiding a non-resettable machine in a complex network of resettable machines, as in Fig. 2.4. This can be put in the form of Fig. 2.3 by grouping the components if there are no feedback loops, as shown by the dashed lines. If the input feeds the non-resettable machine directly, as in Fig. 2.5, there will be a non-resettable endomorphic image. The type of non-resettability described above will be referred to as due to a composite machine having a non-resettable component.

A non-resettable machine which fits into none of the above four categories is shown in Fig. 2.6. The fact that this machine has a prime number of states does not rule out its being a composite machine since a transient state is possible, but if it were composite there would be an endomorphic image. The machine of Fig. 2.6 is an endomorphic image of the machine of Fig. 2.7, which is the decoder of the anagrammatic code of Schützenberger:

000
0010
0011
01
100
1010
1011
110
111

This code is complete and has the prefix property both from the left and from the right.
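The three properties claimed for this code can be verified mechanically. The sketch below tests the prefix property on the words as written and on the reversed words. For completeness it relies on a standard fact not stated in the report: a binary prefix code is complete exactly when the sum of $2^{-\text{length}}$ over its words equals 1. The function names are assumptions made for illustration.

```python
from fractions import Fraction

def has_prefix_property(words):
    """True if no word is a proper prefix of another word."""
    return not any(u != v and v.startswith(u) for u in words for v in words)

def kraft_sum(words):
    """Sum of 2**(-length) over the words of a binary code."""
    return sum(Fraction(1, 2 ** len(w)) for w in words)

code = ["000", "0010", "0011", "01", "100", "1010", "1011", "110", "111"]
reversed_code = [w[::-1] for w in code]

print(has_prefix_property(code),           # prefix property from the left
      has_prefix_property(reversed_code),  # prefix property from the right
      kraft_sum(code))                     # equals 1 for a complete prefix code
```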
Now Fig. 2.6 can be obtained from Fig. 2.8 by the coalescing process shown in Fig. 2.9. Here it is necessary to find a pair of states such as $X_1$ and $X_2$ whose 0 arrows go to the same state Y and whose 1 arrows go to the same state Z. Note that X, Y, a, b, c, d, e need not be distinct from each other or from $X_1$ and $X_2$. The converse process of splitting a state into two states can always be carried out. Now the property of resettability is invariant to coalescing or splitting. To show this, note that if a reset sequence for state Y exists in Fig. 2.9 on the right machine it also will reset the left machine to Y, and conversely. If no such reset signal exists for one machine, none exists for the other. A machine $M'$ which can be obtained from a machine $M$ by repeated coalescings will be said to be a contracted form of $M$, and $M$ will be said to be an expanded form of $M'$. A contracted machine is an endomorphic image, but not every endomorphic image can be obtained by coalescing. Since Fig. 2.6 is a contracted form of the non-resettable Fig. 2.8, it cannot be resettable. It remains to show that Fig. 2.8 is non-resettable. In Fig. 2.10 a tandem connection of a binary resettable machine and a ternary non-resettable machine (palindrome) is shown to be equivalent to Fig. 2.8.

[Fig. 2.3 Tandem Machines (component labels: resettable, non-resettable; overall: non-resettable)]

[Fig. 2.9 Coalescing Process (panels: expanded form, contracted form)]

There are several sufficient conditions for resettability which are known. For example, the set of states marked $Z_1, Z_2, \ldots$ in Fig. 2.11 has the property that a sequence of 1's eventually drives the machine to one definite state of the set. If all states are in such a set the graph is naturally resettable. If a sequence of 0's results in the only possible states being in one of the Z's, then the graph is also resettable, and so forth.

The problem of resettability does not seem to have been treated in the literature of finite state machines. Ginsburg describes some similar problems in his book (18), but very strong assumptions are made concerning the existence of output signals from the machine.

3.1 Testing for resettability  The only algorithm which has been mentioned here for determining whether a particular finite state machine is resettable is that implicit in Eq. 2.15, and this becomes an algorithm only when a bound is found for the length of a reset signal. A non-algebraic algorithm will now be given which will provide such a bound, and which is very useful for visualizing the resetting process. Consider $n$ identical machines fed by the same input (where $n$ is the number of states) but each started in a different state. The composite machine will have a state diagram which will show at a glance whether the original machine is resettable and what the shortest reset signal is. The algorithm will be illustrated using the machine of Fig. 30 of the old report (12) and Section 2.3 above. Refer to Fig. 3.1 and note that the state "ABC" might be thought of as either "A and B and C" (parallel machines) or as "A or B or C" (indicating complete uncertainty as to the state). After the input "0" the state C becomes impossible and therefore "AB" results. The resulting composite graph has $n + \binom{n}{2} + \binom{n}{3} + \ldots + 1 = 2^n - 1$ states, and the graph can be finished until a 0 and a 1 arrow leave each state. The original graph will appear as part of the composite graph, namely those "single letter" states. If a path is found from the composite state representing all states to any state of the original graph, then the graph is resettable. The shortest path gives the shortest reset signals. Since the machine is determinate, there will be no path from a state with $r$ letters to a state with more than $r$ letters, and in particular no arrows leave the original graph. In Fig. 3.1 it can be seen that the shortest reset signals are 01 and 11, and that the shortest signal which insures state B is 010. Similarly, a signal which keeps the machine continually in an unknown state is 10000.... In Fig. 3.2 is shown a graph which is not resettable, the periodic machine. Palindromes have both arrows returning to the all-state node.

In a bounded delay machine (12), all sequences longer than a certain amount are resets.

It is now apparent that the longest path from the all-state to a single-state which repeats no composite state can have no more than $2^n - n - 1$ symbols.
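The composite-graph procedure described above amounts to a breadth-first search over sets of possible states, starting from the all-state. The following sketch returns a shortest reset signal and the state it insures, or `None` when no single-letter state can be reached. The machine shown is a hypothetical three-state example, not the machine of Fig. 3.1, and the function name is an assumption made for illustration.

```python
from collections import deque

def shortest_reset(delta):
    """Breadth-first search of the composite graph.

    delta maps each state to the pair (next state on 0, next state on 1).
    Returns (signal, final state) for a shortest reset signal, or None.
    """
    start = frozenset(delta)
    signal_to = {start: ""}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        if len(subset) == 1:
            return signal_to[subset], next(iter(subset))
        for k, symbol in enumerate("01"):
            nxt = frozenset(delta[s][k] for s in subset)
            if nxt not in signal_to:
                signal_to[nxt] = signal_to[subset] + symbol
                queue.append(nxt)
    return None

# Hypothetical machine: 0 steps A->B->C->A; 1 sends C to A and leaves A, B fixed.
machine = {"A": ("B", "A"), "B": ("C", "B"), "C": ("A", "A")}

# Hypothetical periodic circle of three states (not resettable).
circle = {"A": ("B", "B"), "B": ("C", "C"), "C": ("A", "A")}

print(shortest_reset(machine), shortest_reset(circle))
```

For this three-state example the shortest reset signal has four symbols, which equals the bound $2^n - n - 1$ for $n = 3$.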

In Fig. 3.2 it will be noticed that starting from ABC, a closed set {AB, AC} is reached. Every determinate connected machine will have one and only one such closed set which can be reached from the all-state. This closed set will have the following properties:

1) all state labels have the same number of letters

2) every letter (original state) appears at least once in the closed set

Let the closed set have labels with $\lambda$ letters. If $\lambda = 1$ the machine is resettable; if $\lambda > 1$ it is not resettable. If $\lambda = n$ the machine is a palindrome, as defined in Section 2.3 above.
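The closed set and its label size can be found by exploring every composite state reachable from the all-state. In the sketch below the closed set is taken to be the reachable composite states with the fewest letters; this is an assumption made for the example, chosen because such states can lead neither to larger nor to smaller labels. The period-2 machine is hypothetical and was chosen so that the closed set comes out as {AB, AC}.

```python
def closed_set(delta):
    """Return (number of letters, closed set) for the composite graph.

    delta maps each state to the pair (next state on 0, next state on 1).
    """
    start = frozenset(delta)
    reachable = {start}
    stack = [start]
    while stack:
        subset = stack.pop()
        for k in (0, 1):
            nxt = frozenset(delta[s][k] for s in subset)
            if nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)
    letters = min(len(s) for s in reachable)
    return letters, {s for s in reachable if len(s) == letters}

# Hypothetical periodic machine: A goes to B on 0 and to C on 1; B and C return to A.
periodic = {"A": ("B", "C"), "B": ("A", "A"), "C": ("A", "A")}

# Hypothetical palindrome: 0 keeps the state, 1 interchanges A and B.
toggle = {"A": ("A", "B"), "B": ("B", "A")}

# Hypothetical resettable machine.
resettable = {"A": ("B", "A"), "B": ("C", "B"), "C": ("A", "A")}

print(closed_set(periodic))
print(closed_set(toggle)[0], closed_set(resettable)[0])
```

The three examples give label sizes 2, 2 (equal to $n$) and 1, corresponding to a non-resettable machine, a palindrome and a resettable machine respectively.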

3.2 Calculation of code "compression"  The coding process which is to be considered here is sometimes called "recoding": a sequence of discrete symbols from an information source is decomposed into message words, these words are transformed into signal words by the encoder and transmitted over the channel, the decoder decomposes the sequence of signal words and transforms them back to message words, and finally, the message words are strung together to form the original source sequence. The "compression" achieved by the code is the ratio of the length of a source sequence to the length of the corresponding channel sequence, provided that the same size alphabet is used in each sequence. Since the ratio referred to depends on the length of the sequence and on the particular words in the sequence, the definition must be made more explicit. It is the purpose of this section to compare several formulas for calculating this compression. If the set of message words is fixed and their occurrence statistically independent, Huffman has shown how to maximize the compression by properly choosing the channel words. If the whole recoding process is considered there will not generally be a single most efficient code, because the longer the words (and the more of them there are) the more compression will be achieved. There may be, however, a maximum compression for codes of a given "complexity". The complexity of a code should be defined in terms of the size of the apparatus required to encode and decode it, and an exact method for calculating complexity necessarily involves a method of sequential apparatus synthesis.

The amount of compression achieved by a code depends not only on the code, but on the information source. The sources to be considered here will be finite Markov processes with fixed transition probabilities. The encoder does not know the internal states of the source during the process, but there is a binary word associated with each state transition and this is fed to the encoder. The source can be assumed to be in the following standard form without loss of generality: There are $m$ internal states $S_1, S_2, \ldots, S_m$. From each state there are exactly two transitions possible, one generating a 0 and the other a 1 to be fed to the encoder. These two transitions can be represented by arrows in the state diagram used to visualize the process; an arrow leaving one state can either go to another state or return to the same state. Let $p_{ij}$ be the probability of the $i$'th state being next if the present state is $S_j$, and let $r_{kj}$ ($k = 0, 1$) be the probability of the symbol $k$ being generated by the source if it is presently in state $S_j$. It follows that $p_{ij}$ is either zero, equal to one of the $r_{kj}$, or equal to unity. It will be assumed that the source is ergodic (15) so that a unique set of state probabilities $P_i$ exists satisfying

$\sum_{j=1}^{m} p_{ij} P_j = P_i, \quad i = 1, 2, \ldots, m$  (3.1)

The encoding process will be described by a finite state transducer having states $R_1, R_2, \ldots, R_n$. Again two arrows will leave each state, one for a 0 symbol from the source and the other for 1. Associated with each arrow is either a channel word of finite length, or no word. The combination of source and encoder can now be regarded as a Markov process with $mn$ states $S_i R_j$. The transition probabilities of the combination process will be denoted by $q_{ij}$, and the state probabilities will satisfy

$\sum_{j=1}^{mn} q_{ij} Q_j = Q_i, \quad i = 1, 2, \ldots, mn$  (3.2)

The $q_{ij}$ will be zero, or equal to one of the $p_{ij}$, or unity. The combination process may not be ergodic, so that there may be several solutions for $Q$. There will also be a set of probabilities $t_{kj}$ ($k = 0, 1$) representing the probability that the channel word corresponding to symbol $k$ being fed to the composite process while in state $j$ will be generated. Again, $t_{kj}$ can be zero, equal to one of the $r_{kj}$, or unity. If the channel word which is generated when $k$ is fed to the encoder has $l_{kj}$ symbols, then the expected length of a channel word for a single state transition of the composite process is

$L = \sum_{j=1}^{mn} \sum_{k=0}^{1} Q_j \, t_{kj} \, l_{kj}$  (3.3)

provided that the $Q_j$ are uniquely determined by Eq. 3.2. The compression $C$ is the reciprocal of $L$, since every state transition in the composite process corresponds to a single symbol from the source.

The method of calculation will be illustrated by encoding a "woodcut source" with the code

00 → 0
01 → 10
1 → 11

A "woodcut source" will be defined as a binary source in which the probability of a 0 is $1 - \beta$ if the preceding symbol was 1 and $1 - \alpha$ if the preceding symbol was 0, and in which the probability of a 1 is $\alpha$ if the preceding symbol was 0 and $\beta$ if the preceding symbol was 1. This source, the code, and the composite process are illustrated in Fig. 3.3. If $\alpha$ and $1 - \beta$ are small this source will generate long strings of 0's and 1's, and this is characteristic of a scanned block drawing or woodcut. If $\alpha = \beta$ this is the usual binary source with independent symbols. Outputs from either source or encoder are shown in parentheses. The stationary probabilities of the composite process must satisfy


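$$\begin{bmatrix} \beta & \alpha & \alpha & \beta \\ 1-\beta & 0 & 1-\alpha & 0 \\ 0 & 1-\alpha & 0 & 1-\beta \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} Q_1 \\ Q_2 \\ Q_3 \\ Q_4 \end{bmatrix} = \begin{bmatrix} Q_1 \\ Q_2 \\ Q_3 \\ Q_4 \end{bmatrix}$$

The unique solution, with the constraint $\sum_{i=1}^{4} Q_i = 1$, is: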
Note that the composite state associated with $Q_4$ is a transient state which can never be reoccupied, and therefore $Q_4 = 0$. The average length $L$ can now be calculated from Eq. 3.3 as

$$\frac{1}{C} = L = \frac{1 + 3\alpha - 2\alpha^2 + \alpha\beta - \beta}{(2 - \alpha)(1 + \alpha - \beta)}$$
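The calculation can be checked numerically by iterating the stationary equation of the composite process and then applying Eq. 3.3. The sketch below rests on an assumed reading of the four composite states, which is not spelled out in the surviving text: state 1 is "last source symbol 1, encoder between words", state 2 is "last symbol 0, encoder in the middle of a word", state 3 is "last symbol 0, encoder between words", and state 4 is the transient state. The function names and the numerical values of $\alpha$ and $\beta$ are arbitrary choices for illustration.

```python
def woodcut_L(alpha, beta, iterations=5000):
    """Expected channel symbols per source symbol, from Eq. 3.2 and Eq. 3.3."""
    a, b = alpha, beta
    # Transition matrix of the composite process; column j lists the
    # probabilities of leaving composite state j.
    M = [[b,     a,     a,     b],
         [1 - b, 0,     1 - a, 0],
         [0,     1 - a, 0,     1 - b],
         [0,     0,     0,     0]]
    Q = [0.25] * 4
    for _ in range(iterations):          # repeated application converges to Q
        Q = [sum(M[i][j] * Q[j] for j in range(4)) for i in range(4)]
    # t[j][k]: probability of source symbol k in composite state j (assumed reading).
    t = [(1 - b, b), (1 - a, a), (1 - a, a), (1 - b, b)]
    # l[j][k]: length of the channel word emitted on symbol k in state j
    # (0 means no word is emitted yet).
    l = [(0, 2), (1, 2), (0, 2), (1, 2)]
    return sum(Q[j] * t[j][k] * l[j][k] for j in range(4) for k in (0, 1))

def closed_form_L(alpha, beta):
    """The closed-form expression for L quoted in the text."""
    return ((1 + 3 * alpha - 2 * alpha ** 2 + alpha * beta - beta)
            / ((2 - alpha) * (1 + alpha - beta)))

print(woodcut_L(0.2, 0.7), closed_form_L(0.2, 0.7))
```

With these values both computations agree, and the compression $C = 1/L$ comes out slightly below unity, so this particular code lengthens the output for that source.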


A different result is obtained if the expected length of the output word is divided by the expected length of the input word:

$$L' = \frac{\sum_{r=1}^{w} p(r)\, y_r}{\sum_{r=1}^{w} p(r)\, x_r}$$  (3.5)

where $p(r)$ is the probability of word $r$ in the code table, $y_r$ is the length of the $r$'th output word, $x_r$ the length of the $r$'th input word, and $w$ the number of words. In the present example, the probability that the source is in the state following a 0 is $\frac{1-\beta}{1+\alpha-\beta}$ and the probability that it is in the state following a 1 is $\frac{\alpha}{1+\alpha-\beta}$; therefore:

$$p(1) = \Pr\{00\} = \frac{\alpha}{1+\alpha-\beta}(1-\beta)(1-\alpha) + \frac{1-\beta}{1+\alpha-\beta}(1-\alpha)^2 = \frac{(1-\alpha)(1-\beta)}{1+\alpha-\beta}$$

$$p(2) = \Pr\{01\} = \frac{\alpha(1-\beta)}{1+\alpha-\beta}$$
This is not the same as $L$. This shows, among other things, that Huffman's method for deciding on the channel word set will not give the most efficient code even if the source word set is kept fixed. The reason is that the occurrences of the various source words are not statistically independent events. Smallness of the ratio of Eq. 3.5 is not an indication of the code's efficiency unless the words occur independently.
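The disagreement between the two measures can be seen numerically. In the sketch below each word probability is computed as the stationary probability that the word appears at an arbitrary position of the source sequence, following the pattern of the expression for Pr{00}. Treating the third word (the single symbol 1) the same way is an assumption of this example. The names `word_probabilities` and `ratio_L` are hypothetical, and the closed form for $L$ is the one quoted earlier.

```python
def word_probabilities(alpha, beta):
    """Stationary probabilities of the source words 00, 01 and 1."""
    d = 1 + alpha - beta
    after0 = (1 - beta) / d      # probability that the preceding symbol was 0
    after1 = alpha / d           # probability that the preceding symbol was 1
    p00 = after1 * (1 - beta) * (1 - alpha) + after0 * (1 - alpha) ** 2
    p01 = after1 * (1 - beta) * alpha + after0 * (1 - alpha) * alpha
    p1 = after1 * beta + after0 * alpha
    return p00, p01, p1

def ratio_L(alpha, beta):
    """Expected output word length divided by expected input word length (Eq. 3.5)."""
    p = word_probabilities(alpha, beta)
    y = (1, 2, 2)                # lengths of the channel words 0, 10, 11
    x = (2, 2, 1)                # lengths of the source words 00, 01, 1
    return (sum(pr * yr for pr, yr in zip(p, y))
            / sum(pr * xr for pr, xr in zip(p, x)))

def closed_form_L(alpha, beta):
    """The closed-form expression for L obtained from Eq. 3.3."""
    return ((1 + 3 * alpha - 2 * alpha ** 2 + alpha * beta - beta)
            / ((2 - alpha) * (1 + alpha - beta)))

print(ratio_L(0.2, 0.7), closed_form_L(0.2, 0.7))   # differ for a woodcut source
print(ratio_L(0.3, 0.3), closed_form_L(0.3, 0.3))   # agree for independent symbols
```

The two quantities coincide when $\alpha = \beta$, which is the case of independent symbols, and separate as soon as the source has memory.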


Conclusions  and  Program  for  Next  Interval 


It is expected that the methods for analyzing Turing machines and computers can be used to lead to more definite conclusions as to the efficiency of realizing logical functions with sequential machines. The four state machines will be tabulated and classified according to their properties. An attempt will be made to derive necessary and sufficient conditions for resettability by extending the classes of non-resettable machines considered above.